Gopal Gupta, Enrico Pontelli, Khayri A.M. Ali, Mats Carlsson, Manuel V. Hermenegildo
July 2001 ACM Transactions on Programming Languages and Systems (TOPLAS), vol

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references,

Since the early days of logic programming, researchers in the field realized the potential for exf
of logic programs. Their high-level nature, the presence of nondeterminism, and their referentic
make logic programs interesting candidates for obtaining speedups through parallel execution. < 
applications of logic programming frequently involve irregular computatio ... 


Keywords: Automatic parallelization, constraint programming, logic programming, parallelism. 
T. Fiebig, S. Helmer, C.-C. Kanne, G. Moerkotte, J. Neumann, R. Schiele, T. Westmann
December 2002 The VLDB Journal — The International Journal on Very Large Data Bases, 
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf Additional Information: full citation, abstract, citings,

Several alternatives to manage large XML document collections exist, ranging from file systems 
specifically tailored XML base management systems. In this paper we give a tour of Natix, a dal 
scratch for storing and processing XML data. Contrary to the common belief that management < 
traditional databases like relational systems, we illustrate how almost every component in a ... 

Keywords: Database, XML 


The V-Way Cache: Demand Based Associativity via Global Replacement 
Moinuddin K. Qureshi, David Thompson, Yale N. Patt 

May 2005 ACM SIGARCH Computer Architecture News , Proceedings of the 32nd Anr 

Architecture ISCA '05, volume 33 issue 2 

Publisher: IEEE Computer Society, ACM Press 

Full text available: pdf Additional Information: full citation, abstract, index terms

As processor speeds increase and memory latency becomes more critical, intelligent design and 
increasingly important. The efficiency of current set-associative caches is reduced because prog 
memory accesses across different cache sets. We propose a technique to vary the associativity 
demands of the program. By increasing the number of tag-store entries relative to the ... 
Luca Benini, Giovanni de Micheli

April 2000 ACM Transactions on Design Automation of Electronic Systems (TODAES), 



Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references,

This tutorial surveys design methods for energy-efficient system-level design. We consider elec 
and software layers. We consider the three major constituents of hardware that consume energ 
storage units, and we review methods of reducing their energy consumption. We also study mo 
and methods for energy-efficient software design and compilation. This survey ...
Raksit Ashok, Saurabh Chheda, Csaba Andras Moritz

May 2004 ACM Transactions on Computer Systems (TOCS), volume 22 issue 2 

Publisher: ACM Press 

Full text available: pdf(1.41 MB) Additional Information: full citation, abstract, references,

This article presents Cool-Mem, a family of memory system architectures that integrate conven 
aware address translation, and compiler-enabled cache disambiguation techniques, to reduce ei 
architectures. The solutions provided in this article leverage on interlayer tradeoffs between arc 
layers. Cool-Mem achieves power reduction by statically matching memory operations with ene 

Keywords: Energy efficiency, translation buffers, virtually addressed caches 
Murali Annavaram, Jignesh M. Patel, Edward S. Davidson

November 2003 ACM Transactions on Computer Systems (TOCS), volume 21 issue 4 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references,

With the continuing technological trend of ever cheaper and larger memory, most data sets in c 
main memory. In this configuration, the performance bottleneck is likely to be the gap between 
memory access latency. Previous work has shown that database applications have large instruc 
processor caches effectively. In this paper, we propose Call Graph Prefetching (CGP), ... 

Keywords: Instruction cache prefetching, call graph, database 
G. Edward Suh, Dwaine Clarke, Blaise Gassend, Marten van Dijk, Srinivas Devadas

June 2003 Proceedings of the 17th annual international conference on Supercomputi 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references,

We describe the architecture for a single-chip aegis processor which can be used to build compi 
software attacks. Our architecture assumes that all components external to the processor, such 
different implementations. In the first case, the core functionality of the operating system is tru 
also describe a variant implementation assuming an untrusted operating s ... 

Keywords: certified execution, secure processors, software licensing 


External memory algorithms and data structures: dealing with massive data
Jeffrey Scott Vitter 

June 2001 ACM Computing Surveys (CSUR), volume 33 issue 2 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references,

Data sets in large applications are often too massive to fit completely inside the computers inte 
communication (or I/O) between fast internal memory and slower external memory (such as di: 
this article we survey the state of the art in the design and analysis of external memory (or EM; 
is to exploit locality in order to reduce the I/O costs. We consider a varie ... 

Keywords: B-tree, I/O, batched, block, disk, dynamic, extendible hashing, external memory, \ 
methods, multilevel memory, online, out-of-core, secondary storage, sorting 


Cool-Mem: combining statically speculative memory accessing with selective address tran
Raksit Ashok, Saurabh Chheda, Csaba Andras Moritz
October 2002 ACM SIGOPS Operating Systems Review , ACM SIGPLAN Notices , ACM SIC
Proceedings of the 10th international conference on Architectural support 

systems ASPLOS-X, volume 36 , 37 , 30 Issue 5 , 10 , 5 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references,

This paper presents Cool-Mem, a family of memory system architectures that integrate convent
aware address translation, and compiler-enabled cache disambiguation techniques, to reduce ei 
architectures. It combines statically speculative cache access modes, a dynamic CAM based Tac 
mispredicted accesses, various conventional multi-level associative cache organizations, embed 

Information flow inference for free

François Pottier, Sylvain Conchon
September 2000 ACM SIGPLAN Notices , Proceedings of the fifth ACM SIGPLAN internation

ICFP '00, Volume 35 Issue 9 

Publisher: ACM Press 

Full text available: pdf(749.77 KB) Additional Information: full citation, abstract, references,

This paper shows how to systematically extend an arbitrary type system with dependency infori 
interference proofs for the new system may rely upon, rather than duplicate, the soundness pre 
virtually any of the type systems known today with information flow analysis, while requiring or 
on an untyped operational semantics for a labelled calculus akin to core ML. Thus, it i ... 
Jih-Kwon Peir, Shih-Chang Lai, Shih-Lien Lu, Jared Stark, Konrad Lai
June 2002 Proceedings of the 16th international conference on Supercomputing
Publisher: ACM Press 

Full text available: pdf(248.57 KB) Additional Information: full citation, abstract, references,

A processor must know a load instruction's latency to schedule the load's dependent instruction
processors do not know this latency until well after the dependent instructions should have bee 
themselves and the load. One solution to this problem is to predict the load's latency, by predic 
cache. Existing cache hit/miss predictors, however, can only correctly ... 

Keywords: bloom filter, data cache, data prefetching, data speculation, instruction scheduling 


Level set and PDE methods for computer graphics 

David Breen, Ron Fedkiw, Ken Museth, Stanley Osher, Guillermo Sapiro, Ross Whitaker 
August 2004 Proceedings of the conference on SIGGRAPH 2004 course notes GRAPH 'O
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract

Level set methods, an important class of partial differential equation (PDE) methods, define dyr 
surface) of a sampled, evolving nD function. The course begins with preparatory material that ii 
equations to solve problems in computer graphics, geometric modeling and computer vision. Th 
several different types of differential equations, e.g. the level set eq ... 

Fast detection

Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced Studies or 
Publisher: IBM Press 

Full text available: pdf(4.21 MB) Additional Information: full citation, abstract, references,

Understanding distributed applications is a tedious and difficult task. Visualizations based on pn 
better understanding of the execution of the application. The visualization tool we use is Poet, a 
Waterloo. However, these diagrams are often very complex and do not provide the user with th 
experience, such tools display repeated occurrences of non-trivial commun ... 
Norman P. Jouppi

May 1990 ACM SIGARCH Computer Architecture News , Proceedings of the 17th ann 

Architecture ISCA '90, Volume 18 Issue 3a
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references,

Projections of computer technology forecast processors with peak performance of 1,000 MIPS ir 
could easily lose half or more of their performance in the memory hierarchy if the hierarchy des 
This paper presents hardware techniques to improve the performance of caches. Miss caching p 
cache and its refill path. Misses in the ca ... 

Fast address lookups using controlled prefix expansion 
V. Srinivasan, G. Varghese

February 1999 ACM Transactions on Computer Systems (TOCS), Volume 17 Issue 1
Publisher: ACM Press 

Full text available: pdf(258.50 KB) Additional Information: full citation, abstract, references,

Internet (IP) address lookup is a major bottleneck in high-performance routers. IP address look 
matching prefix lookup. It is compounded by increasing routing table sizes, increased traffic, hii 
IPv6 addresses. We describe how IP lookups and updates can be made faster using a set of t
controlled prefix expansion, transf ... 

Keywords: Internet address lookup, binary search on levels, controlled prefix expansion, expa 
router performance


Let caches decay: reducing leakage energy via exploitation of cache generational behavio 
Zhigang Hu, Stefanos Kaxiras, Margaret Martonosi 

May 2002 ACM Transactions on Computer Systems (TOCS), Volume 20 Issue 2 

Publisher: ACM Press 

Full text available: pdf(673.03 KB) Additional Information: full citation, abstract, references,

Power dissipation is increasingly important in CPUs ranging from those intended for mobile use, 
for highend servers. Although the bulk of the power dissipated is dynamic switching power, leal 
Chipmakers expect that in future chip generations, leakage's proportion of total chip power will 
methods for reducing leakage power within the cache memories of the CPU. Be ... 

Keywords: Cache memories, cache decay, generational behavior, leakage power 


Automatic tiling of iterative stencil loops
Zhiyuan Li, Yonghong Song 

November 2004 ACM Transactions on Programming Languages and Systems (TOPLAS), Vol
Publisher: ACM Press 

Full text available: pdf(947.59 KB) Additional Information: full citation, abstract, references,

Iterative stencil loops are used in scientific programs to implement relaxation methods for numi 
loops iteratively modify the same array elements over different time steps, which presents opp( 
temporal data locality through loop tiling. This article presents a compiler framework for autom; 
objective of improving the cache performance. The article first presents a ... 

Keywords: Caches, loop transformations, optimizing compilers 


Multithreading I: Pointer cache assisted prefetching
Jamison Collins, Suleyman Sair, Brad Calder, Dean M. Tullsen 

November 2002 Proceedings of the 35th annual ACM/IEEE international symposium on Mi< 
Publisher: IEEE Computer Society Press 

Full text available: pdf Additional Information: full citation, abstract, references,

Data prefetching effectively reduces the negative effects of long load latencies on the performai 
employ hardware structures to predict future memory addresses based on previous patterns. Tl 
actual program code to determine future load addresses for prefetching. This paper proposes the


transitions, to aid prefetching. The pointer cache provides, for a given pointer's ... 
Steven Hsu, Shih-Lien Lu, Shih-Chang Lai, Ram Krishnamurthy, Konrad Lai

November 2002 Proceedings of the 35th annual ACM/IEEE international symposium on Mi< 
Publisher: IEEE Computer Society Press 

Full text available: pdf Publisher Site Additional Information: full citation, abstract, references,


As pipeline width and depth grow to improve performance, memory arrays in microprocessors c 
increase in physical size, which prolongs the access time due to wiring delay. In order to boost 
multiple cycles to complete an access. This delays the scheduling of dependent instructions and 
proposes a different circuit organization to enable fast and slow accesses solely de ... 
Stefanos Kaxiras, Zhigang Hu, Margaret Martonosi

May 2001 ACM SIGARCH Computer Architecture News , Proceedings of the 28th ann 

architecture ISCA '01, volume 29 issue 2 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references,

Power dissipation is increasingly important in CPUs ranging from those intended for mobile use,
for high-end servers. While the bulk of the power dissipated is dynamic switching power, leakac 
Chipmakers expect that in future chip generations, leakage's proportion of total chip power will 

This paper examines methods for reducing leakage power within the cache memori ... 